Abstract


Research on automatic software repair is concerned with the develop- 
ment of systems that automatically detect and repair bugs. One well-known 
class of bugs is the infinite loop. Every computer programmer or user has, at 
least once, experienced this type of bug. We state the problem of repairing 
infinite loops in the context of test-suite based software repair: given a test 
suite with at least one failing test, generate a patch that makes all test cases 
pass. Consequently, repairing infinite loops means having at least one test
case that hangs by triggering the infinite loop. Our system to automatically
repair infinite loops is called Infinitel. We develop a technique to manipulate
loops so that one can dynamically analyze the number of iterations
of loops; decide to interrupt the loop execution; and dynamically examine 
the state of the loop on a per-iteration basis. Then, in order to synthesize a 
new loop condition, we encode this set of program states as a code synthesis 
problem using a technique based on Satisfiability Modulo Theory (SMT). 
We evaluate our technique on seven seeded bugs and on seven real bugs.
Infinitel is able to repair all of them, with repair times ranging from seconds
to one hour on a standard laptop configuration.


1 Introduction 


Research on automatic software repair is concerned with the development of
systems that automatically detect and repair bugs. We consider a bug to be a
behavior observed during program execution that does not correspond to the
expected one. Automatic software repair is close to other research areas such as
automatic debugging, software testing, program synthesis, and machine learning
for software engineering. There have been a number of results in this field
(e.g. [1] 74 EN [13]), since seminal work at the end of the 2000s [1] [7] [5].

The ultimate goal of automatic software repair is to minimize the maintenance 
costs. Software maintenance is often considered the most expensive development 
phase [8] and a key task during maintenance is the correction of bugs (colloquially 
“bug fixing”). The automatic repair of even a fraction of software bugs would 
translate to significant savings in developer time and costs. 

Hamill and Goseva-Popstojanova [8] showed that coding faults are one of the most
common types of software faults. That is, faults directly in the source


# Commit message: Fix hang on ’grep --color "" anything’

- while ((match_offset = (*execute) (beg, lim - beg, &match_size, 1)) != (size_t) -1) {
+ while (lim-beg && (match_offset = (*execute) (beg,lim - beg,&match_size,1)) != (size_t) -1) {
      char const *b = beg + match_offset;
      /* Avoid matching the empty line at the end of the buffer. */
      if (b == lim)
        break;
+     /* Avoid hanging on grep --color "" foo */
+     if (match_size == 0)
+       break;
      fwrite (beg, sizeof (char), match_offset, stdout);


Example 1.1. Infinite loop patch in grep.c (commit 3ec7191f). 


code, according to a given set of requirements. For instance, incorrectly assigned 
values, uninitialized values, missing data validation, incorrect loop statements, 
and so on. We have recently argued that for devising sound repair techniques, 
we need a systematic taxonomy of common coding faults, so that we can develop 
an effective repair method for each type [12]. That is, to each bug corresponds a 
defect class and, to repair it, a specific repair method which exploits the defect 
class’ intrinsic properties is used. 

One well-known defect class is the “infinite loop”. Every computer programmer
or user has, at least once, experienced this type of bug. It is so much a part of
the programming folklore that Apple Inc. has renamed the street encircling its
headquarters “Infinite Loop”. Infinite loops are responsible for hung programs and
frozen user interfaces. Technically, an infinite loop is a loop which unintentionally
iterates forever without returning an expected result or throwing an exception. In
this paper, we aim to automatically repair this defect class.

We believe the infinite loop defect class is fairly common. Take, for instance, 
one of the historically most popular UNIX commands: grep. Example 1.1 shows
an excerpt of a commit in grep’s codebase. As indicated by the commit message,
the changes of the commit are made to fix an infinite loop. In order to do so, the 
boolean condition of the loop is corrected and a break statement is introduced. 

In this paper, we propose a technique to automatically repair infinite loops. 
To our knowledge, there is no published work on this topic. We state the problem 
of repairing infinite loops in the context of test-suite based software repair [12]: 
given a test suite with at least one failing test, generate a patch that makes all test 
cases pass. Consequently, repairing infinite loops means having at least one test
case that hangs by triggering the infinite loop. However, the loop that is running 
infinitely may be executed a finite number of times in other test cases. Hence, 
repairing an infinite loop means modifying the behavior of the infinite loop so 
that every test case invoking the infinite loop both halts and passes. In our case, 
the patch we aim to synthesize consists of a new boolean expression for the loop 
condition of the infinite loop. In other words, for the repair to be successful, the 
new predicate must correct every infinite execution happening in the non-halting 
test cases and must also keep the already passing test cases correct. 

Footnote: see [...], last visited April 19, 2015.

Our system to automatically repair infinite loops is called Infinitel. It is
based on a technique we develop to manipulate loops so that one can dynamically 
analyze the number of iterations of loops; decide to interrupt the loop execution; 
and dynamically examine the state of the loop on a per-iteration basis. Then, in 
order to synthesize a new loop condition, we encode this set of program states as 
a code synthesis problem using a technique based on Satisfiability Modulo Theory 
(SMT). 

We evaluate Infinitel on seven seeded bugs and on seven real bugs. Our
technique is able to repair all of them, with repair times ranging from seconds
to one hour on a standard laptop configuration. We discuss those cases in depth
to understand the strengths and weaknesses of our automatic repair technique.

To sum up, the contributions of this paper are: 


• The problem statement of automatic repair for infinite loops.

• A source code instrumentation technique to dynamically analyze the behavior
  of loops.

• An end-to-end repair algorithm for infinite loops, based on runtime loop
  state analysis and code synthesis.

• The evaluation of the proposed solution with 7 seeded bugs and 7 real bugs.


The rest of the paper is organized as follows. In Section 2 we define a
terminology for loops that is used throughout the paper. In Section 3 we describe
our solution to automatically repair the infinite loop defect class. In Section 4 we
discuss the evaluation of our approach. In Section 5 we analyze the core
assumptions behind our system. In Section 6 we compare our approach to other
related work. We conclude the paper in Section 7.


2 Loop Theory 


In this section we introduce the concepts and terminology related to loops that 
will be used throughout the paper. We also define the fault class we address. 


2.1 Terminology 


A loop is a control flow statement that allows a block of statements to be
executed repeatedly. Classical loops are for, while or do-while loops. The loop
body refers to the block of statements inside the loop.

The looping guard is the boolean condition used in a loop to control termination.
For instance, the looping guard in Example 2.1(a) is “i < n”. The role of
the looping guard is twofold. On one hand, before executing the first iteration of
the loop, the looping guard acts as a precondition (except for do-while loops). If the
precondition is not met (the first evaluation of the looping guard returns false),
the flow of the program continues without entering the loop altogether. On the


(a)
    void clear(int[] array) {
      int n = array.length;
      for (int i = 0; i < n; i++) {
        array[i] = 0;
      }
    }

(b)
    int index(int[] sorted, int e) {
      int low = 0;
      int high = sorted.length - 1;
      do {
        int mid = (low + high + 1) / 2;
        if (sorted[mid] <= e) {
          low = mid;
        } else {
          high = mid;
        }
      } while (sorted[low] != e);
      return low;
    }

(c)
    int method(int a) {
      int b = a;
      while (b > 0) {
        if (b == 18) {
          return a;
        }
        if (b == 9) {
          break;
        }
        b -= 1;
      }
      return b;
    }

Example 2.1. (a) for loop. (b) idempotent while loop. (c) while loop with a break and
return statements.


other hand, if the precondition is met, the first iteration begins and, from then on, 
the looping guard will be acting as an exit condition. If any subsequent evaluation 
of the looping guard evaluates to false, then the exit condition is met, and the 
flow can continue outside of the loop. 

A loop may also terminate when certain instructions are executed within the
loop body (break and return statements, exceptions). A break statement is an
instruction that breaks the loop from within the loop body (e.g., the statement of
the second if in Example 2.1(c)). A return statement is an instruction that both
breaks the loop and exits the function or method containing it (e.g., the statement
of the first if in Example 2.1(c)).

A loop execution starts from the first time the looping guard is evaluated and 
ends when the flow of the program continues outside the loop. That is, even if 
a loop performs no iterations, we consider the single evaluation of the looping 
guard to false as a loop execution. The iteration record of a loop execution is the 
number of times the looping guard is evaluated to true during the loop execution. 

A door-door execution is a loop execution which ends because the evaluation 
of the looping guard returns false. The term is coined from the analogy of entering 
and exiting a room through a door, the conventional entrance into a room. On the 
contrary, a door-window execution is a loop execution which ends after executing a 
statement from within the loop body. In this case, the termination of the loop can 
have three causes: the evaluation of a break statement, the evaluation of a return
statement, or the raising of an uncaught exception. Following the same analogy, the
name suggests an unanticipated evacuation from a room through a window.
The exit nature of a loop refers to the way a loop execution ends: conditional 
(door-door), break, return or throw exit (door-window). 
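
The terminology above can be made concrete with a small sketch. It re-implements the method of Example 2.1(c) with extra bookkeeping so that each loop execution reports its exit nature and its iteration record. The class `ExitNatureDemo`, the enum `ExitNature` and the holder `LoopExecution` are hypothetical names introduced only for this illustration; they are not part of Infinitel.

```java
// Illustrative sketch only: all names here are hypothetical, not Infinitel's API.
final class ExitNatureDemo {

    /** The four exit natures: conditional is door-door, the others are door-window. */
    enum ExitNature { CONDITIONAL, BREAK, RETURN, THROW }

    /** Outcome of one loop execution. */
    static final class LoopExecution {
        final ExitNature exit;
        final int iterationRecord; // number of times the guard evaluated to true
        final int result;          // value returned by the enclosing method

        LoopExecution(ExitNature exit, int iterationRecord, int result) {
            this.exit = exit;
            this.iterationRecord = iterationRecord;
            this.result = result;
        }
    }

    /** Same control flow as method(int) in Example 2.1(c), with bookkeeping added. */
    static LoopExecution trace(int a) {
        int b = a;
        int iterations = 0;
        while (b > 0) {
            iterations++; // the guard has just evaluated to true
            if (b == 18) {
                return new LoopExecution(ExitNature.RETURN, iterations, a);
            }
            if (b == 9) {
                // the original code breaks here and then returns b
                return new LoopExecution(ExitNature.BREAK, iterations, b);
            }
            b -= 1;
        }
        // the guard evaluated to false: door-door execution
        return new LoopExecution(ExitNature.CONDITIONAL, iterations, b);
    }

    public static void main(String[] args) {
        for (int a : new int[] {0, 5, 12, 20}) {
            LoopExecution e = trace(a);
            System.out.println(a + " -> " + e.exit
                    + ", iteration record " + e.iterationRecord
                    + ", result " + e.result);
        }
    }
}
```

For the input 0 the guard is false on its first evaluation, so the execution is door-door with an iteration record of 0; the inputs 12 and 20 leave through a "window" (break and return respectively).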

An infinite execution happens when a loop never halts; that is, when the 
looping guard keeps evaluating to true. We refer to “infinite loop” and “infinite 
execution” interchangeably. An idempotent loop is a loop whose looping guard
can be evaluated arbitrarily more times than needed without changing the output
of the algorithm. That is, in this type of loop, the correctness of the loop is
defined as a lower bound on the number of iterations. The loop in Example 2.1(b)
illustrates this phenomenon, in a binary search algorithm. If the looping guard
is changed (keeping the loop body intact) so that the loop performs a linear
number of iterations, the output of the algorithm would still be correct.
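
The following sketch illustrates idempotence on the loop of Example 2.1(b). It compares the original guard with a guard that forces `sorted.length` iterations. It assumes a sorted array with distinct elements that contains the searched value; the class name `IdempotentLoopDemo`, the method `indexLinear` and the iteration counter are additions made for this illustration.

```java
// Illustrative sketch only; assumes a sorted array of distinct values containing e.
final class IdempotentLoopDemo {

    /** Example 2.1(b) as written: stops as soon as sorted[low] == e. */
    static int index(int[] sorted, int e) {
        int low = 0;
        int high = sorted.length - 1;
        do {
            int mid = (low + high + 1) / 2;
            if (sorted[mid] <= e) {
                low = mid;
            } else {
                high = mid;
            }
        } while (sorted[low] != e);
        return low;
    }

    /** Same search steps, but the guard forces a linear number of iterations. */
    static int indexLinear(int[] sorted, int e) {
        int low = 0;
        int high = sorted.length - 1;
        int iterations = 0; // bookkeeping only, used by the modified guard
        do {
            int mid = (low + high + 1) / 2;
            if (sorted[mid] <= e) {
                low = mid;
            } else {
                high = mid;
            }
            iterations++;
        } while (iterations < sorted.length);
        return low;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 3, 5, 7, 9, 11};
        for (int e : sorted) {
            System.out.println(e + ": " + index(sorted, e) + " vs " + indexLinear(sorted, e));
        }
    }
}
```

Once `low` has reached the index of `e`, further iterations leave it unchanged, which is why the extra guard evaluations do not alter the result under the stated assumptions.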


2.2 Fault class 


The fault class we address is “infinite loop”. An infinite loop is the infinite
repetitive execution of the loop body. An infinite loop happens when the execution
of the loop body no longer changes the part of the execution state that
is observed by the looping condition. An infinite loop is critical because: 1) the
program is no longer responsive; 2) the infinite loop consumes 100% of the
CPU on the machine where it happens.

Namely, there are two kinds of infinite loops, related to the two aforementioned 
roles of a loop condition: wrong precondition or wrong exit condition. In the first 
case, the bug occurs because the program does not skip the loop when it should. 
In the second case, the bug occurs because the loop does not terminate at the 
appropriate moment. 

To fix a wrong precondition bug, there are two possible repairs. First, one can
wrap the loop within an if/then statement encoding the precondition. Second,
one can modify the loop condition so that the precondition becomes correct while 
the exit predicate is still valid. 

For a wrong exit condition bug, there are three possible repairs: 1) changing 
the loop condition; 2) adding a window exit such as “if (X) break” or “if (X) 
return”; 3) changing the loop body such that the body correctly modifies the 
execution state that is analyzed in the loop condition. The automatic repair 
technique we present in this paper targets a change in the loop condition, which 
is able to both fix incorrect preconditions and incorrect exit predicates. 


3 Contribution 


In this section we present our approach to fixing infinite loops, called Infinitel.
We focus on while loops where the bug lies in the loop condition. According
to the analysis presented in Section 2, our approach repairs wrong exit conditions
of door-door loop executions. Since our technique is based on test cases, the infinite
door-door executions to be fixed are those manifested while running the test suite.

In this context, “repairing” the infinite loop means finding a looping guard for 
the infinite loop such that each test case using that loop both halts and passes all 
the assertions. We first introduce the overview of our repair approach, and then 
we proceed by describing each step individually. 


3.1 Overview 


In Algorithm 1 we present the top level algorithm of our repair method. The
input for our algorithm is the source code containing an infinite loop (parameter 


Algorithm 1 Top level algorithm to repair an infinite loop.

1: procedure INFINITELOOPREPAIR(src, tests)
2:   src2 ← INSTRUMENTLOOPS(src)
3:   loop ← DETECTINFINITELOOP(src2, tests)
4:   thresholds ← FINDTHRESHOLDS(loop, src2, tests)
5:   patch ← FINDPATCH(loop, thresholds, src, tests)
6:   return patch
7: end procedure


 1     int method(int a) {
 2       int b = a;
 3  +    LoopMonitor LM_83 = Global.getMonitor(83);
 4  +    int ITERS_83 = 0;
 5  -    while (b > 0) {
 6  +    while (true) {
 7  +      boolean stay = LM_83.decide(b > 0, ITERS_83);
 8  +      LM_83.collect(stay, b, a, ...);
 9  +      if (stay) {
10  +        ITERS_83++;
11           if (b == 18) {
12             return a;
13           }
14           if (b == 9) {
15             break;
16           }
17           b -= 1;
18  +      } else break;
19       }
20       return b;
21     }

Example 3.1. Illustration of our loop instrumentation on Example 2.1(c). The code prefixed
by +, in green, is automatically injected with source code transformation.


src) and the test suite of the source code (parameter tests). The test suite is 
composed of passing tests and at least one hanging test, the one that triggers the 
infinite loop. 

The first step is to instrument the source code of the input project src. The 
instrumentation enables us to remotely control loop executions (for instance, to 
stop tests from hanging). Once the instrumentation is performed, the second step 
is to detect the presence of an infinite loop during the execution of the test suite. 
We do this by running the test suite and detecting hanging tests. 

In the third step, the goal is to find the number of iterations needed by the 
detected infinite loop to pass the assertions executed at the end of hanging tests. 
We call this number a “threshold” for that loop. When breaking the infinite loop 
beyond the threshold, the test case passes. 

In the last step, a new looping guard is synthesized; this is the final patch. The
detailed explanation of each step is given in the following sections (Subsections 3.2
to 3.5).


3.2 Project Instrumentation 


We explain here how to modify the implementation of a loop in order to control 
its execution. The idea is to “implant” a hook in the loop source code to modify 


Algorithm 2 Detecting infinite loops with instrumentation.

 1: procedure DETECTINFINITELOOPS(src, tests)
 2:   hangingTests ← {}
 3:   monitors ← IMPLANTEDMONITORS()
 4:   SETLIMITINALL(monitors, 1000000)
 5:   for test ∈ tests do
 6:     RUN(src, test)
 7:     for monitor ∈ monitors do
 8:       if monitor.HASEXCEEDINGEXECUTION() then
 9:         loop ← monitor.GETLOOP()
10:         invocation ← monitor.GETEXCEEDINGEXECUTION()
11:         hangingTests.PUT(test, loop, invocation)
12:       end if
13:     end for
14:   end for
15:   return hangingTests
16: end procedure


the semantics of the loop at runtime. Specifically, we want to control the looping
guard. The loop instrumentation is shown in Example 3.1. Firstly, we fetch
the loop monitor that will control the loop (line 3). Secondly, a local variable is
created to store the iteration record of each loop execution (line 4). Then, we
modify the original loop by wrapping the original loop body (lines 11-17) within
an if statement (lines 9-18). The original looping guard is deleted (line 5) and
replaced by a while(true) (line 6). Now, the decision to proceed with a new iteration
or break the loop is delegated to the loop monitor. The decision to keep executing
the loop or to break is stored in another local variable (line 7). Then, according
to this decision, either a new iteration is carried out (then branch of the new
wrapping if) or the loop breaks (else branch). In the former case, the iteration
counter is incremented (line 10). Finally, we add one more statement to this
instrumentation to collect the execution information of each iteration (line 8).
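
A minimal, runnable sketch of such a monitor is given below, under stated assumptions. The method names `decide` and `collect` come from Example 3.1; everything else (the class `LoopMonitorSketch`, the constructor taking a limit, the way values are stored, and passing the monitor as a parameter instead of fetching it from a global registry) is a simplification made for this illustration and should not be read as Infinitel's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a simplified stand-in for the loop monitor of Example 3.1.
final class LoopMonitorSketch {

    static final class LoopMonitor {
        private final int limit;                              // maximum permitted iterations
        private final List<int[]> collected = new ArrayList<>();
        private boolean limitReached = false;

        LoopMonitor(int limit) {
            this.limit = limit;
        }

        /** Decides whether the loop iterates once more. */
        boolean decide(boolean originalGuard, int completedIterations) {
            if (originalGuard && completedIterations >= limit) {
                limitReached = true; // the original guard wanted to continue, we stop it
                return false;
            }
            return originalGuard;
        }

        /** Stores the decision (1 or 0) followed by the observed values. */
        void collect(boolean stay, int... values) {
            int[] row = new int[values.length + 1];
            row[0] = stay ? 1 : 0;
            System.arraycopy(values, 0, row, 1, values.length);
            collected.add(row);
        }

        boolean limitReached() {
            return limitReached;
        }

        List<int[]> collected() {
            return collected;
        }
    }

    /** Instrumented form of Example 2.1(c), following the shape of Example 3.1. */
    static int method(int a, LoopMonitor monitor) {
        int b = a;
        int iters = 0;
        while (true) {
            boolean stay = monitor.decide(b > 0, iters);
            monitor.collect(stay, b, a);
            if (stay) {
                iters++;
                if (b == 18) {
                    return a;
                }
                if (b == 9) {
                    break;
                }
                b -= 1;
            } else break;
        }
        return b;
    }

    public static void main(String[] args) {
        LoopMonitor unrestricted = new LoopMonitor(1_000_000);
        System.out.println(method(5, unrestricted) + ", rows: " + unrestricted.collected().size());
        LoopMonitor restricted = new LoopMonitor(2);
        System.out.println(method(5, restricted) + ", limit reached: " + restricted.limitReached());
    }
}
```

With a generous limit the instrumented method behaves like the original one; with a limit of 2 the monitor overrides the guard after two iterations, which is the mechanism the later steps rely on.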


3.3 Infinite Loop Detection 


Our method to detect infinite loops is straightforward. We keep track of the
number of iterations throughout a loop execution and, if a maximum number
of iterations is exceeded, we assume it is an infinite loop. We implement this
detection strategy with the non-trivial instrumentation explained in Section 3.2.

During the loop execution, the loop monitor is responsible for deciding whether
to iterate or break before starting a new iteration. To do this, it receives the
evaluation of the original looping guard and the number of already completed
iterations. If this number exceeds a maximum number of iterations, the loop
monitor labels the loop as “infinite” and breaks it. The threshold is fully
parameterizable; we use a value of 1 million. Across all our experiments, this has
only yielded one false positive (a loop wrongly detected as infinite).

The infinite loop detection is detailed in Algorithm 2. At this stage, the
parameter src is the instrumented source code and the parameter tests is the 
test suite. We simply run the whole test suite on src. Every loop execution is 




Algorithm 3 Identifying angelic records.

 1: procedure FINDTHRESHOLDS(loop, src, tests)
 2:   thresholds ← Dictionary.NEW()
 3:   monitor ← loop.GETMONITOR()
 4:   hangingTests ← HANGINGTESTSOF(loop)
 5:   for test ∈ hangingTests do
 6:     number ← GETEXCEEDINGEXECUTION(test, loop)
 7:     for (i = 0; i < 1000000; i++) do
 8:       monitor.SETLIMITIN(number, i)
 9:       result ← RUN(src, test)
10:       if result.ISSUCCESSFUL() then
11:         thresholds.PUT(test, i)
12:         break
13:       end if
14:     end for
15:   end for
16:   return thresholds
17: end procedure


monitored by a loop monitor. In the event of an infinite execution of a hanging 
test, the loop monitor will detect the infinite execution and it will break the loop 
when the threshold is exceeded. Also, because this infinite loop is detected during 
the infinite execution, the loop monitor stores the invocation rank of the infinite 
execution (for instance, “the fourth loop execution of hanging test testABC”). 
The output of this algorithm is a specific data structure that contains the list of 
hanging tests, the infinite loop where each one hangs, and the invocation rank of 
the infinite execution in each case. 
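
The detection step and the notion of invocation rank can be sketched as follows. This is a simplified illustration with a single monitored loop: the class `DetectionSketch`, the buggy method `countDownByTwo`, the representation of tests as `Runnable` objects and the result map are all assumptions of this sketch, not the data structures used by Infinitel.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: one monitored loop, tests modelled as Runnables.
final class DetectionSketch {

    static final class LoopMonitor {
        private final int limit;
        private int executions = 0;          // loop executions started in the current test
        private int exceedingExecution = -1; // invocation rank of the infinite execution

        LoopMonitor(int limit) {
            this.limit = limit;
        }

        boolean decide(boolean originalGuard, int completedIterations) {
            if (completedIterations == 0) {
                executions++; // first guard evaluation of a new loop execution
            }
            if (originalGuard && completedIterations >= limit) {
                if (exceedingExecution < 0) {
                    exceedingExecution = executions;
                }
                return false; // break the presumed infinite execution
            }
            return originalGuard;
        }

        void reset() {
            executions = 0;
            exceedingExecution = -1;
        }

        int exceedingExecution() {
            return exceedingExecution;
        }
    }

    static final LoopMonitor MONITOR = new LoopMonitor(1_000_000);

    /** Hypothetical buggy loop: it never reaches 0 when n is odd. */
    static void countDownByTwo(int n) {
        int iters = 0;
        while (true) {
            boolean stay = MONITOR.decide(n != 0, iters);
            if (stay) {
                iters++;
                n -= 2;
            } else break;
        }
    }

    /** Maps each hanging test to the invocation rank of its infinite execution. */
    static Map<String, Integer> detect(Map<String, Runnable> tests) {
        Map<String, Integer> hanging = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> test : tests.entrySet()) {
            MONITOR.reset();
            test.getValue().run();
            if (MONITOR.exceedingExecution() > 0) {
                hanging.put(test.getKey(), MONITOR.exceedingExecution());
            }
        }
        return hanging;
    }

    public static void main(String[] args) {
        Map<String, Runnable> tests = new LinkedHashMap<>();
        tests.put("testEven", () -> { countDownByTwo(4); countDownByTwo(10); });
        tests.put("testOdd", () -> {
            countDownByTwo(2); countDownByTwo(6); countDownByTwo(8); countDownByTwo(7);
        });
        System.out.println(detect(tests));
    }
}
```

In this sketch the fourth loop execution of the second test exceeds the limit, so that test is reported as hanging with invocation rank 4, mirroring the “fourth loop execution of hanging test testABC” example in the text.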


3.4 Finding Thresholds in Hanging Tests 


A hanging test, when executed, gets trapped in an infinite loop execution because 
the looping guard never evaluates to false. That is, the looping guard does not 
break the loop when it should. To rectify this, we have to amend the looping 
guard so that it breaks the loop during the infinite execution at the appropriate 
moment. Hence, we first have to determine the appropriate moment to break the 
loop in each infinite execution. 

We estimate the appropriate moment to break the loop in an infinite execution
by controlling the iteration record of the infinite loop in that execution. As seen
in Section 3.3, we can use the loop monitor to break any loop by simply setting
a maximum value of permitted iterations. If we set the maximum value equal
to y during the infinite execution of an infinite loop, and we observe that the
hanging test both halts and passes, then we have found this appropriate moment:
it is when y iterations have been executed. We refer to the target y value as an
“angelic record”. This terminology follows the literature ([4], [6]).

Our method to find the angelic record of a hanging test is simple: we explore
values from 0 to the predefined threshold in order, run the hanging test each time
and assess whether it passes. If it does, the probed value is the angelic record y.
The rationale of this simple strategy is that we have observed that in real test
suites, the number of loop iterations is likely a low value (less than 20). Therefore,
not only do we expect the angelic record to lie well within the explored range
(0 to 1 million), but we also expect probing in ascending order to be the fastest
way to find it.
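
The ascending search can be sketched in a few lines. The sketch below abstracts "run the hanging test with the loop limited to a given number of iterations" as an `IntPredicate`; that abstraction, the class `AngelicRecordSearch` and the toy test `sumTestPasses` are assumptions made for illustration.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

// Illustrative sketch only: the hanging test is abstracted as a predicate on the limit.
final class AngelicRecordSearch {

    /** Probes limits 0, 1, 2, ... and returns the first one for which the test passes. */
    static OptionalInt findAngelicRecord(IntPredicate testPassesWithLimit, int maxLimit) {
        for (int limit = 0; limit < maxLimit; limit++) {
            if (testPassesWithLimit.test(limit)) {
                return OptionalInt.of(limit);
            }
        }
        return OptionalInt.empty(); // no limit below maxLimit makes the test pass
    }

    /**
     * Hypothetical hanging test: a loop that should sum 1 + 2 + 3 + 4 but whose
     * guard never becomes false. The forced limit plays the role of the monitor.
     */
    static boolean sumTestPasses(int limit) {
        int sum = 0;
        int i = 1;
        int iters = 0;
        while (iters < limit) { // forced break after 'limit' iterations
            sum += i;
            i++;
            iters++;
        }
        return sum == 10; // assertion of the hypothetical test
    }

    public static void main(String[] args) {
        System.out.println(findAngelicRecord(AngelicRecordSearch::sumTestPasses, 1_000_000));
    }
}
```

Because small values are probed first, a low angelic record is found after only a handful of test runs, which matches the rationale given above.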

The angelic record search is detailed in Algorithm 3. We receive a detected
infinite loop (parameter loop), the instrumented source code (parameter src) and
the test suite of the project (parameter tests). From the previous step (Section 3.3),
we already know the hanging tests of an infinite loop. Then, for each hanging
test, we probe different values until we find the angelic record. We do this for all
hanging tests. We store this information in an associative array where the key is
a hanging test of the infinite loop under repair.


3.5 Patch Synthesis 


To synthesize a new looping guard, we use a program synthesis technique. The
idea is to synthesize a new looping guard that makes all tests pass.


3.5.1 Synthesis as an SMT Problem 


In this section we briefly introduce the code synthesis method used in this
paper. The goal of code synthesis is to synthesize a program Ψ which complies
with a specification. Input-output synthesis is one kind of synthesis: for any
specified input I, the synthesized program Ψ should output an acceptable output
O. To synthesize Ψ, the code synthesis algorithm receives an “input-output” pair
set V. For any given pair (I, O) ∈ V, whenever I is the argument of Ψ, then the
program has to return the value O.

We use component-based synthesis [10]. In addition to input-output pairs, this
synthesis algorithm takes a set of “base components” C (operators such as >). For
any given pair (I, O) ∈ V, whenever I is the argument of Ψ, then the program
has to return the value O; and, in order to compute this value, it must only use
operators from the C set.

Component-based synthesis encodes C and V as first-order logic constraints.
The constraints describe both the syntax of the algorithm (such as number of
lines, declaration of local variables, etc.) and the semantics (to make the program
compliant with the specification). Then, an SMT solver is used to decide whether
there exists a solution satisfying all constraints. If there is, the solution is decoded
back and translated into an algorithm.
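
To give an intuition of what the input-output specification and the component set mean, the sketch below replaces the SMT encoding with a naive enumeration: it tries every predicate of the form `x op y` built from two distinct inputs and one comparison component, and returns the first one that agrees with all pairs. This is deliberately not the component-based SMT technique of [10]; it only illustrates the same specification. The class `EnumerativeSynthesis`, its helpers and the inputs `i` and `n` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Illustrative sketch only: brute-force enumeration instead of an SMT encoding.
final class EnumerativeSynthesis {

    /** One input-output pair: named integer inputs and the expected guard value. */
    static final class Pair {
        final Map<String, Integer> input;
        final boolean output;

        Pair(Map<String, Integer> input, boolean output) {
            this.input = input;
            this.output = output;
        }
    }

    /** Convenience factory for pairs over the two hypothetical inputs i and n. */
    static Pair pair(int i, int n, boolean output) {
        Map<String, Integer> input = new LinkedHashMap<>();
        input.put("i", i);
        input.put("n", n);
        return new Pair(input, output);
    }

    /** A small component set containing comparison operators only. */
    static Map<String, BiPredicate<Integer, Integer>> comparisonComponents() {
        Map<String, BiPredicate<Integer, Integer>> c = new LinkedHashMap<>();
        c.put("<", (x, y) -> x < y);
        c.put("<=", (x, y) -> x <= y);
        c.put("==", (x, y) -> x.intValue() == y.intValue());
        c.put("!=", (x, y) -> x.intValue() != y.intValue());
        return c;
    }

    /** Returns the first "left op right" predicate consistent with every pair, or null. */
    static String synthesize(List<Pair> spec,
                             Map<String, BiPredicate<Integer, Integer>> components) {
        if (spec.isEmpty()) {
            return null;
        }
        List<String> names = new ArrayList<>(spec.get(0).input.keySet());
        for (String left : names) {
            for (String right : names) {
                if (left.equals(right)) {
                    continue;
                }
                for (Map.Entry<String, BiPredicate<Integer, Integer>> op : components.entrySet()) {
                    if (agrees(spec, left, right, op.getValue())) {
                        return left + " " + op.getKey() + " " + right;
                    }
                }
            }
        }
        return null;
    }

    private static boolean agrees(List<Pair> spec, String left, String right,
                                  BiPredicate<Integer, Integer> op) {
        for (Pair p : spec) {
            if (op.test(p.input.get(left), p.input.get(right)) != p.output) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Pair> spec = new ArrayList<>();
        // passing test: i goes 0, 1, 2, 3 with n == 3
        spec.add(pair(0, 3, true));
        spec.add(pair(1, 3, true));
        spec.add(pair(2, 3, true));
        spec.add(pair(3, 3, false));
        // hanging test under "i != n": i goes 0, 2, 4 and must stop at 4
        spec.add(pair(2, 3, true));
        spec.add(pair(4, 3, false));
        System.out.println(synthesize(spec, comparisonComponents()));
    }
}
```

Enumeration scales poorly as components are combined, which is one reason an approach based on constraint solving is attractive for larger component sets.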


3.5.2 Synthesis of a New Looping Guard 


We now use the synthesis method described in Section 3.5.1 to generate a new
looping guard for the infinite loop. The specification of the new looping guard 
can be informally expressed as follows: the looping guard predicate should allow 
every test executing the infinite loop to both halt and pass. 

To synthesize the new looping guard, we need two arguments: the input-output 
pair set and the component set. We first explain how to yield the latter, and then 


Algorithm 4 Obtaining the input-output pair set.

 1: procedure SPECIFICATION(loop, thresholds, src)
 2:   V ← Set.NEW()
 3:   monitor ← loop.GETMONITOR()
 4:   tests ← TESTSOF(loop)
 5:   for test ∈ tests do
 6:     if ISHANGINGTESTOF(loop, test) then
 7:       number ← GETEXCEEDINGEXECUTION(test, loop)
 8:       threshold ← thresholds.KEYFOR(test)
 9:       monitor.SETLIMITIN(number, threshold)
10:     end if
11:     RUN(src, test)
12:     pairs ← monitor.GETPAIRS()
13:     V.ADDALL(pairs)
14:   end for
15:   return V
16: end procedure


the former. 

Component Set: The component set used for synthesis contains different
kinds of operators. In Infinitel, we use comparison operators (C_≤, C_<, C_=, C_≠),
then logic operators (C_not, C_or, C_and), then linear arithmetic operators (C_+, C_−),
then if-then-else (C_ite), and, finally, multiplication (C_×).

The selection of components is done incrementally. We start off with an empty
set of components. In this case, there is only one possibility to find a new looping
guard: the patch uses a boolean input variable. That is, an input i ∈ I of the
specification is equal to the corresponding output for every (I, O) pair of V. If
this is the case, then the new looping guard is simply “while (i)”.

If not, we formulate a new SMT problem with the same specification V and 
a new non-empty component set. If we succeed, the synthesis phase is finished. 
If not, we keep adding components until either a new looping guard is found, or 
until we exhaust all of the available components and finish the synthesis phase 
unsuccessfully. 
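
The first step of this incremental search, the empty component set, needs no solver at all and can be sketched directly: look for a boolean input whose value equals the expected output in every pair. The class `BooleanInputPatch`, the method `findBooleanInput` and the input names used in `main` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the empty-component-set case of the incremental search.
final class BooleanInputPatch {

    /**
     * Returns "while (name)" for the first boolean input that equals the expected
     * output in every pair, or null if no such input exists.
     */
    static String findBooleanInput(List<Map<String, Boolean>> inputs, List<Boolean> outputs) {
        if (inputs.isEmpty() || inputs.size() != outputs.size()) {
            return null;
        }
        for (String name : inputs.get(0).keySet()) {
            boolean matches = true;
            for (int k = 0; k < inputs.size(); k++) {
                Boolean value = inputs.get(k).get(name);
                if (value == null || value.booleanValue() != outputs.get(k).booleanValue()) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                return "while (" + name + ")";
            }
        }
        return null;
    }

    /** Builds one input over two hypothetical boolean variables. */
    static Map<String, Boolean> input(boolean done, boolean hasNext) {
        Map<String, Boolean> in = new LinkedHashMap<>();
        in.put("done", done);
        in.put("hasNext", hasNext);
        return in;
    }

    public static void main(String[] args) {
        List<Map<String, Boolean>> inputs = new ArrayList<>();
        List<Boolean> outputs = new ArrayList<>();
        inputs.add(input(false, true));  outputs.add(true);
        inputs.add(input(false, true));  outputs.add(true);
        inputs.add(input(false, false)); outputs.add(false);
        System.out.println(findBooleanInput(inputs, outputs));
    }
}
```

Only when this lookup fails does it become necessary to add components and formulate an SMT problem, as described above.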

Input-Output Pair Set: The input-output specification V is assembled as
follows. As shown in the project instrumentation (Example 3.1), the
loop monitor is a reification of the looping guard: it decides to iterate or break
the loop in each iteration. The collection of input-output pairs is done by the loop
monitor. At each iteration it creates an (I, O) pair associating the decision of the
loop monitor to O and the context information to I, whose collection is described
in Section 3.5.3.


3.5.3 Runtime Value Collection 


The context information is collected by the loop monitor within the callback in 
line 8 of Example 3.1. The context information reflects the local state of the
program at each iteration. It is composed of variables collected in 6 different 
ways: 

Reachable variables hunt: we scan the scope of the loop to gather every reachable 
variable. A reachable variable is a variable with two qualities: it is accessible 

Type                                 Queries
Collection (incl. Lists and Maps)    variable.size(), variable.isEmpty()

Table 1: Queries for each type.



from the loop scope (it is declared within the lexical scope of the loop) and it is 
initialized. It could either be a local variable, a method parameter or an instance 
field. 
Visible field access: for each reachable variable of a user-defined type, we also 
gather its visible fields. 
Getters: for a reachable variable of a user-defined type, we also include, when 
possible, non-visible fields. To do this, we review the source code declaration of 
the variable class in search of “getter methods”. A getter method is a method with 
the following characteristics: it has no parameters, it is implemented in one line, 
the line is a return statement, and the returned element is an instance field. 
Recycling of the original looping guard: although the original looping guard is not 
used as the real looping guard after the loop instrumentation, it is highly likely 
that it still provides precise information about the iteration context. For this 
reason, we also include the value of the evaluation of the original looping guard. 
Subvalues of the original looping guard: whenever possible, we also inspect the 
values of subcomponents of the original looping guard. For instance, if the original 
looping guard is a conjunction, we also include the evaluation of each subpredicate 
of the conjunction. 

In essence, we have many options to gather context information within the 
lexical scope of the loop. However, only boolean or numeric values are supported 
by SMT solvers. For this reason, we need to further refine the amassed variables: 


Extraction by value: for a variable of primitive type (boolean, char, int, double, 
etc.) we take its value. 


Extraction by queries: for each gathered variable of a non-primitive type, we 
perform different queries. First, we check nullness; and, if the variable is not null, 
we also extract information by using a hardcoded list of typical queries (such as 
the length of a String, or the size of a List). We can see the hardcoded queries 
in Table [1] The way to interpret the table is: if the variable’s class subclasses a 
given superclass, then we perform the corresponding queries on the variable. 
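
The two extraction modes can be sketched as a single function that turns a named runtime value into boolean or numeric entries. The sketch covers only the queries mentioned in the text (nullness, length of a String, size of a collection or map); the class `ValueExtraction`, the method `extract` and the way entries are labelled are assumptions of this illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: turns a runtime value into boolean or numeric entries.
final class ValueExtraction {

    static Map<String, Object> extract(String name, Object value) {
        Map<String, Object> out = new LinkedHashMap<>();
        // Extraction by value: (boxed) primitives are taken as they are.
        if (value instanceof Boolean || value instanceof Number) {
            out.put(name, value);
            return out;
        }
        if (value instanceof Character) {
            out.put(name, (int) ((Character) value).charValue());
            return out;
        }
        // Extraction by queries: nullness first, then type-specific queries.
        out.put(name + " != null", value != null);
        if (value instanceof String) {
            out.put(name + ".length()", ((String) value).length());
        } else if (value instanceof Collection) {
            out.put(name + ".size()", ((Collection<?>) value).size());
        } else if (value instanceof Map) {
            out.put(name + ".size()", ((Map<?, ?>) value).size());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(extract("n", 7));
        System.out.println(extract("s", "abc"));
        System.out.println(extract("xs", Arrays.asList(1, 2)));
        System.out.println(extract("o", null));
    }
}
```

Every entry produced this way is a boolean or a number, so it can be handed to the synthesis step regardless of the original type of the variable.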
The last step consists of a refinement of the input-output pair set, in order to
improve the SMT synthesis. Firstly, we enrich each set I of every (I, O) pair with
the constant values of -1, 0 and 1. These are values commonly used in predicates,
Algorithm 5 Patch synthesis.

 1: procedure FINDPATCH(loop, thresholds, src, tests)
 2:   spec ← SPECIFICATION(loop, thresholds, src)
 3:   components ← List.NEW()
 4:   while not EXHAUSTEDALL(components) do
 5:     smtProblem ← ENCODETOSMT(spec, components)
 6:     if smtProblem.ISFEASIBLE() then
 7:       solution ← smtProblem.SOLUTION()
 8:       patch ← DECODETOPATCH(solution)
 9:       break
10:     end if
11:     bundle ← NEXTCOMPONENTBUNDLE()
12:     components.ADDALL(bundle)
13:   end while
14:   return patch
15: end procedure


so we make sure they are available for code synthesis. Then, we remove input
elements which have the same value in every (I, O) pair.
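
A possible reading of this refinement is sketched below. It assumes that the removal of constant-valued elements applies to the collected variables only, since the three added constants are constant by construction and would otherwise be removed again; this interpretation, the class `SpecRefinement` and the method `refine` are assumptions of the sketch. It also assumes that every input of the specification has the same set of variable names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: refinement of the inputs of the input-output pairs.
final class SpecRefinement {

    static List<Map<String, Integer>> refine(List<Map<String, Integer>> inputs) {
        // 1) Find the collected variables whose value differs in at least one pair.
        Set<String> varying = new LinkedHashSet<>();
        if (!inputs.isEmpty()) {
            Map<String, Integer> first = inputs.get(0);
            for (String name : first.keySet()) {
                for (Map<String, Integer> input : inputs) {
                    if (!first.get(name).equals(input.get(name))) {
                        varying.add(name);
                        break;
                    }
                }
            }
        }
        // 2) Keep only those variables and add the constants -1, 0 and 1.
        List<Map<String, Integer>> refined = new ArrayList<>();
        for (Map<String, Integer> input : inputs) {
            Map<String, Integer> r = new LinkedHashMap<>();
            for (String name : varying) {
                r.put(name, input.get(name));
            }
            r.put("-1", -1);
            r.put("0", 0);
            r.put("1", 1);
            refined.add(r);
        }
        return refined;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> inputs = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            Map<String, Integer> input = new LinkedHashMap<>();
            input.put("i", i);
            input.put("n", 3); // same value in every pair
            inputs.add(input);
        }
        System.out.println(refine(inputs));
    }
}
```

Dropping variables that never change reduces the number of candidate inputs the solver has to consider, while the constants stay available for predicates such as comparisons with 0.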


3.5.4 Synthesis Algorithm 


The algorithm of the new looping guard synthesis can be seen in Algorithm 5.
The first step is to collect the input-output pair set (detailed in Algorithm 4). To 
do this we simply execute each test using the loop and fetch the collected input- 
output pairs after each run. For hanging tests, the expected output is based on 
the found infinite execution thresholds. Once we obtain this specification, we start 
the search of a new looping guard. We begin with an empty component set, we 
formulate an SMT problem and use a solver to find a solution. If we succeed, we 
transform back the SMT solution into a boolean code expression, the patch. If 
not, we add components to the component set and formulate a new SMT problem. 
We do this until we exhaust all of the components or a correct looping guard has 
been synthesized. 

Explaining the details of this synthesis technique would require a large amount
of space. Since it is not the contribution of this paper, we refer the reader to the
original paper [10] and our paper presenting our adaptation of the technique to
handle object-oriented code [6].


4 Evaluation 


In this section we present the evaluation of In finitel, our system for automatically 
repairing infinite loops. Our evaluation is based on the repair of 7 seeded bugs 
and 7 real bugs. We aim to answer the following research questions: 


RQ1 Performance: does Infinitel solve the bugs in a reasonable amount of time? 
What is the bottleneck of the repair method? 


RQ2 Appropriateness: are the patches synthesized by Infinitel appropriate? 
How do they compare with the human-written ones? 
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RQ8 Synthesis: How hard is the code synthesis for each bug? does the synthesis 
based on SMT satisfiability scale? 


RQ4 Angelic Record (Section 3.3): does the 1 million iteration limit used for 
finding the angelic record have type I errors (false positives)? Does it affect 
the performance of Infinitel? 


RQ5 State Observation (Section 3.5.3): how does our value collection technique 
qualify the loop state in each bug? 


RQ6 Idempotence: is there any idempotent loop among the real bugs? 


4.1 Evaluation Setup 


Infinitel is implemented in Java, running on an Oracle JRE version 7, with a 
maximum heap of 2 GB. The SMT solver used is Z3 version 4.3.2. The operating 
system where the evaluation is performed is OS X Mavericks. 


4.1.1 Methodology 


Our evaluation is based on the repair of 7 seeded bugs and 7 real bugs. In both 
cases, each bug consists of one infinite loop with at least one hanging test. The 
magic number 7 comes from the fact that we were able to reproduce 7 real bugs 
within the 6 weeks we allocated for bug reproduction, and we wanted the same 
number for symmetry in the result tables. 


Seeded Bugs 


A seeded bug is a project deliberately “infected” with a manually created infinite 
loop. We first create four toy projects which only have one class, one test class 
and one infinite loop. They are called Ex {1... 4}. 

In addition, we seed another 3 infinite loops in large-scale open-source projects. 
Those seeded bugs are more representative than the toy projects because they are 
at real scale. While the bugs are artificial, the performance metrics are 
meaningful for those cases. The process is as follows: we select a loop in the 
project and we perform two small transformations on it. Firstly, we substitute the 
looping guard with “while (true)”. Secondly, the loop body is wrapped with 
a try/catch with an empty catch block. This is done to prevent an exception 
from breaking the infinite loop. 
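
The two transformations can be illustrated on a small loop. The method below is hypothetical and is not one of the seeded loops of the dataset; the always-true guard uses the "".isEmpty() form mentioned in the footnote. The iteration cap is not part of the seeding: it is an assumed stand-in for a loop monitor, added only so that the sketch terminates.

```java
class SeededLoopSketch {

    /** Original loop: terminates when the index reaches the end of the array. */
    static int sumOriginal(int[] values) {
        int total = 0;
        int i = 0;
        while (i < values.length) {
            total += values[i];
            i++;
        }
        return total;
    }

    /**
     * Seeded variant: the guard is always true, and the empty catch block swallows
     * the ArrayIndexOutOfBoundsException that would otherwise end the loop.
     * maxIterations is a hypothetical monitor threshold, not part of the seeding.
     */
    static int sumSeeded(int[] values, int maxIterations) {
        int total = 0;
        int i = 0;
        int iterations = 0;
        while ("".isEmpty()) {
            if (iterations++ >= maxIterations) {
                break; // forced exit by the stand-in monitor
            }
            try {
                total += values[i];
                i++;
            } catch (Exception e) {
                // deliberately empty: the exception must not break the loop
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(sumOriginal(values));
        System.out.println(sumSeeded(values, 1000));
    }
}
```

Without the cap, sumSeeded never returns once the index runs past the array, which is the behavior a hanging test exposes.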

For the large-scale seeded bugs, we use Apache’s Commons Collections 
and Math projects (commits b5ffdaf and 32ef444). The first two seeded bugs 
are in two different loops in Collections (AbstractMapBag.java on line 590 and 
AbstractDualBidiMap.java on line 352), whereas the third one comes from an 
infected loop in Math (FastMath.java on line 3 120). 


http://z3.codeplex.com/ 


Actually, changing to “while (true)” would raise compilation errors because of the presence 
of unreachable code after the loop. Hence, we use an equivalent form: “while ("".isEmpty())”. 


Name      Repository                             Commit   Subproject   Test 
csv       git://git.apache.org/commons-csv.git   4dfc8ed  -            Y 
fop       git://git.apache.org/fop.git           13984cc  -            N 
pdfbox A                                         blocf48               N 
pdfbox B  git://git.apache.org/pdfbox.git        a2ab77f  fontbox      N 
pig       git://git.apache.org/pig.git           5abfbd0  piggybank    Y 
tika      git://git.apache.org/tika.git          1b694e7  tika-parser  N 
uima                                             155596a  jVinci       N 


Table 2: Our dataset of 7 real bugs, up to the commit ID (the first 7 digits of the 
commit checksum). 


Real Bugs 


The seven real bugs come from existing projects of the Apache Git repositories. 
To find real bugs, we have individually analyzed the projects looking for commits 
reporting and fixing an infinite loop bug. Specifically, we perform a keyword-based 
search on the Git log of each project repository (keywords: infinite, loop, 
iteration, hang, endless, ending, terminating). 
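
As an illustration of this filtering step, the sketch below applies the keyword list to commit messages that are assumed to have been exported from the Git log beforehand. The class name, the method name and the sample messages are hypothetical, and case-insensitive substring matching is an assumption about how the search could be done.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CommitKeywordFilter {

    /** Keywords listed in the text. */
    static final List<String> KEYWORDS = Arrays.asList(
            "infinite", "loop", "iteration", "hang", "endless", "ending", "terminating");

    /** Keeps the commit messages that contain at least one keyword (case-insensitive). */
    static List<String> candidates(List<String> commitMessages) {
        List<String> hits = new ArrayList<>();
        for (String message : commitMessages) {
            String lower = message.toLowerCase(Locale.ROOT);
            for (String keyword : KEYWORDS) {
                if (lower.contains(keyword)) {
                    hits.add(message);
                    break; // one matching keyword is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical messages, for illustration only.
        List<String> log = Arrays.asList(
                "Fix infinite loop in parser",
                "Update docs",
                "Parser hangs on empty input");
        System.out.println(candidates(log));
    }
}
```

Substring matching over-approximates (for example, "hang" also matches "changes"), so the resulting candidates still need the manual inspection described above.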

We describe each real bug in Table 2. In the case of csv and pig, the commit 
includes code changes to fix the infinite loop and a test case validating those 
changes (i.e. triggering the infinite loop). For the rest of the commits, the test 
cases triggering the infinite loop are missing. Consequently, we manually created 
tests for the 5 remaining commits. The policy followed to manually create tests 
is the following: a) at least one of these tests triggers an infinite execution of the 
loop that the commit attempts to fix; b) the added hanging tests 
halt and pass with the changes introduced in the commit; and c) the added 
non-hanging tests, if any, pass. (We should mention that the pdfbox B bug is 
detected and incorrectly reported as fixed in commit e41cbd1, but it is actually 
fixed in the later commit a2ab77f. We use the buggy loop in the first commit and use 
the second commit to compare the synthesized looping guard with the manually 
written fix.) 

Note that it takes a lot of time to collect and reproduce real bugs of a given 
defect class. For those 7 bugs, it took us more than 6 weeks. For the sake of 
comparison, closely related works on infinite loops use respectively eight bugs and 
one single bug [2] Q] in their evaluation. 


4.1.2 Metrics 


We now present the different evaluation metrics about the automatic repair of 
each bug. We group them in two categories: “basic metrics” and “time metrics”. 
Unless indicated otherwise, each time metric is rounded to seconds. Time metrics 
are used to answer RQ1. 


http://git.apache.org/ 
http://git-scm.com/ 


Basic Metrics 


Tests: the total number of tests in the test suite of the project. 

Application Classes: the total number of declared classes in the project source 
code, excluding test classes. This metric is also equal to the total number of 
classes which are instrumented and recompiled during the project instrumentation 
(Section 3.2). 

Added Tests: the number of tests added to reproduce the infinite loop bug (only 
in real bugs). 

Added LOC: the total number of lines of the added tests (only in real bugs). 

Hanging Tests: the number of invoking tests which do not halt due to the infinite 
loop. 

Idempotence: whether the hanging tests pass or not when they are run with an 
arbitrarily large number of iterations. If they do, we suspect that the infinite loop 
behaves like an idempotent loop (see Section B). 

Angelic Record: the value of the highest angelic record for the infinite loop under 
repair. 

Context Items: the size of the input-output pair set described in Section 
This number impacts the number of constraints in the SMT problems created 
during code synthesis. 

Context Size: the number of inputs inside each input-output pair, plus 1 (for the 
output value). It represents the number of extracted values being used to describe 
the state of each loop iteration (Section 3.5.3). 

SMT Formulations: the total number of SMT problems needed to synthesize 
a patch. As described in Section 3.5.4, we successively create SMT problems 
by adding new components until the synthesis succeeds. The number of SMT 
problems needed to find a solution is a proxy for the number and complexity of 
the components used for synthesis; it enables us to qualify the difficulty of the 
found patch. 

SMT Components: the total number of components used in the synthesized patch 
(Section 3.5.1). 

SMT Component Types: the number of different component types used in the 
synthesized patch (there are 5 different types: comparison, logic, linear arithmetic, 
multiplication and if-then-else). 

LOC: lines of code in the project source code, excluding test code. Figures are 
obtained with cloc. 


Time Metrics 


Instrumentation: time to implant the loop monitors in every while of the project 
source code (Section 3.2). 

Compilation: time to compile the instrumented source code. 

Test Suite: time to run the test suite of the project. This metric includes the time 
of running (and inducing loop termination of) hanging tests. 


http://cloc.sourceforge.net/ 


Table 3: Evaluation of Infinitel on our Dataset. Rows (basic metrics), given for the 
seeded bugs and for the real bugs: LOC, Application Classes, Tests, Hanging Tests, 
Context Items, Context Size, SMT Formulations, SMT Components, SMT Comp. Types 
and Angelic Record; for the real bugs only: Added Tests, Added LOC and Idempotence. 


Hanging Tests: time to run hanging tests of the infinite loop by breaking after 
the maximum number of iterations. Every infinite execution is interrupted after 
a maximum iteration number is reached (Section 3.3). 

Angelic Value Mining: time to find the angelic records of each hanging test (Section 
3.4). 

Value Collection: time to collect contexts for tests invoking the infinite loop (Section 
3.5.3). 

SMT Solving: overall time solving all SMT problems until a solution is found. 

Total Time: the sum of the previous 7 metrics; it is the total execution time to 
automatically fix the bug. 
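
As a small worked example of the Total Time metric and of its readable form, the sketch below sums per-step durations and formats a number of seconds as h:mm:ss. The class and method names are hypothetical, and the per-step values used in main are made up for illustration; only the conversion of 2 911 seconds to 0:48:31 is a plain arithmetic fact.

```java
class RepairTimeSketch {

    /** Sums the durations (in seconds) of the individual repair steps. */
    static long totalSeconds(long... stepSeconds) {
        long total = 0;
        for (long step : stepSeconds) {
            total += step;
        }
        return total;
    }

    /** Formats a duration in seconds as h:mm:ss. */
    static String readable(long seconds) {
        return String.format("%d:%02d:%02d",
                seconds / 3600, (seconds % 3600) / 60, seconds % 60);
    }

    public static void main(String[] args) {
        // Seven made-up step durations, one per time metric.
        long total = totalSeconds(4, 12, 300, 150, 2, 27, 30);
        System.out.println(total + " s = " + readable(total));
        System.out.println(readable(2911));
    }
}
```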


4.2 Empirical Results 


We answer the research questions presented in Section 4. 


4.2.1 Descriptive Statistics 


Table 3 summarizes all the evaluation metrics. The table is composed of two 
parts. The upper part is about the 7 seeded bugs, the lower part is about the 7 
real infinite loops. 

For each of them, the first lines give descriptive statistics: the number of lines 
of code, the number of application classes (excluding test classes), the number of 
test methods (the basic test unit in the testing framework JUnit), and the number 
of test methods that hang because of the infinite loop. For the seeded bugs, one 
can see that the first three infinite loops have been seeded in large projects: the 
version of Apache Commons Collections under repair, called “Collections A”, has 
25 338 lines of code, over 463 classes. The test suite specifies 14 792 test cases. The 
seeded infinite loop breaks between 1 (for Math, Ex {1...4}) and 57 test cases 
(for Collections B). 

For the real bugs, the projects range between 1 218 LOC (for csv) and 157 445 
LOC (for fop). The number of application classes and test cases follow the same 
trend. For 5/7 real bugs, there is one single test case that triggers the infinite 
loop; however, there are 3 (resp. 2) hanging tests for fop (resp. pdfbox A). 


4.2.2 Performance 


Our automatic repair algorithm automatically fixes all seeded and real bugs. Let 
us now discuss the performance of the technique. Does Infinitel solve the bugs 
in a reasonable amount of time? What is the bottleneck of the repair method? 
For the seeded bugs, the interesting cases are the bugs put in large-scale projects. 
Looking at the time metrics in Table 4, we notice that the total execution time 
for the Apache projects is around 10 minutes. In all three cases, the bottleneck 
is the running time of the test suite. In the case of Collections B, the hanging 
tests account for approximately 50 % of the time of running the test suite 
(before breaking after 1 million iterations). The reason is that the seeded bug in 
Collections B creates 57 hanging tests. 

We now look at the performance metrics for the real bugs in the bottom part 
of Table 4. We can see that fop has the longest repair time with approximately 1 
hour. Then comes uima with almost 49 minutes and csv with roughly 30 minutes. 
For the other 4 real bugs, 10 minutes is enough to repair them. We now analyze 
the bottleneck of each bug individually. 

In csv the clear bottleneck is the time of SMT Solving: 99 % of the total repair 
time is spent in that task. Something similar happens in fop, with 88 %. This is 
due to the size of the SMT problems in csv (where each of the 6 631 context items 
accounts for a constraint in the SMT problem) and to the complexity of the found 
patch in fop (it requires many components, as witnessed by the highest number 
of SMT problems: 4, which means that new component types were added four 
times in a row). 

In uima, running the hanging tests is the evident bottleneck. This is due to a 
performance overhead caused by the string concatenation operation. The infinite 
loop in uima only has one statement, which is a concatenation of strings with the 
sum operator; consequently, for each of the one million iterations before the forced 
break, two strings are created and copied. 

In pdfbox A and pig, the bottleneck seems again to be related to the test cases. 
In pdfbox A almost 70 % of the time to repair the infinite loop (84 seconds) is 
spent executing the test suite (57 seconds) or collecting the values at runtime 
(27 seconds). This figure increases to 95 % of the time (503 seconds) in pig (442 
seconds running the Test Suite and 61 seconds for Value Collection). 

In tika the combined time for running the test suite and the SMT solving 
accounts for more than 90 % of the total repair time. Finally, in pdfbox B the 
SMT solving is negligible and in this case, the repair time is dominated by the 
project instrumentation. 

To sum up, according to our dataset, Infinitel is able to fix infinite loops on 
a standard laptop computer. 


4.2.3 Appropriateness 


Are the patches synthesized by Infinitel appropriate? How do they compare with 
the human-written ones? Now we assess the appropriateness of the synthesized 
looping guards. For the seeded bugs, for Ex {1...4} the patch is obviously the 
expected one since we craft those examples manually. For Collections A the 
original looping guard (before seeding the bug) was restored. For Collections B 
and Math, the original looping guard is restored semantically, but not syntactically. 
In the case of Math, whereas the original looping guard checks that the variable 
mantissa is lower than 2^52 using the bitwise right shift operator, the found looping 
guard does so by comparing with the value of constant TWO_POWER_52, which is 
equivalent. 
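
The equivalence claimed for Math can be checked on boundary values. The exact form of the original shift-based guard is not given in the text, so the shape used below is an assumption, as is the local long definition of TWO_POWER_52; the check only covers non-negative mantissa values.

```java
class MantissaGuardSketch {

    // Hypothetical local constant; the sketch only needs the value 2^52.
    static final long TWO_POWER_52 = 1L << 52;

    /** Assumed shape of a shift-based guard: no bit at position 52 or above is set. */
    static boolean shiftGuard(long mantissa) {
        return (mantissa >> 52) == 0;
    }

    /** Comparison-based guard, in the style of the synthesized patch. */
    static boolean constantGuard(long mantissa) {
        return mantissa < TWO_POWER_52;
    }

    /** Checks that both guards agree on a few non-negative boundary values. */
    static boolean agreeOnSamples() {
        long[] samples = {0L, 1L, TWO_POWER_52 - 1, TWO_POWER_52, TWO_POWER_52 + 1, Long.MAX_VALUE};
        for (long m : samples) {
            if (shiftGuard(m) != constantGuard(m)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnSamples());
    }
}
```

For negative values the two forms differ, which is why the sketch restricts itself to non-negative inputs.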

Let us now concentrate on the real infinite loops. For each subject, Table 5 
gives both the synthesized patch and the human fix. We first consider the csv and pig 
bugs. In the case of pig, the human fix and the found patch are equivalent 
(fileStatusArr is an array, so the length is either 0 or positive). For csv, although 
the fixes are different, both the human fix and the found patch base the 
looping guard on the value of tkn.type. According to our understanding of the 
program, they are equivalent. 
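
The equivalence argument for pig can be verified exhaustively on an abstraction of the two guards. In the sketch below, the two method calls of the real guards are replaced by boolean parameters and the array by its length; these simplifications and all names are assumptions made for illustration.

```java
class PigGuardEquivalence {

    /** Shape of the human fix: !(isNull || isFile || length == 0). */
    static boolean humanFix(boolean isNull, boolean isFile, int length) {
        return !(isNull || isFile || length == 0);
    }

    /** Shape of the synthesized guard: !(isNull || isFile) && (0 < length). */
    static boolean synthesized(boolean isNull, boolean isFile, int length) {
        return !(isNull || isFile) && (0 < length);
    }

    /** Compares both guards for every boolean combination and every length in [0, maxLength]. */
    static boolean equivalentUpTo(int maxLength) {
        boolean[] flags = {false, true};
        for (boolean isNull : flags) {
            for (boolean isFile : flags) {
                for (int length = 0; length <= maxLength; length++) {
                    if (humanFix(isNull, isFile, length) != synthesized(isNull, isFile, length)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalentUpTo(100));
    }
}
```

The two forms agree because an array length is never negative, so "length != 0" and "0 < length" coincide.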

We now consider the tika bug. Our found patch simply restricts the original 
looping guard with an additional condition on the block length (first operand of the patch). 
On the contrary, the human fix is more complex, because it uses an additional 
boolean variable continueLoop. This new variable is updated at the end of every 
iteration, and the value of this variable is used in the new looping guard. This 
manual code is likely more readable than the synthesized one. 

If we analyze the fop bug, we again find different repair strategies. This time, 
the human fix adds a break statement at the end of the loop body. As we have 
already discussed in Section 2.2, a behavior equivalent to adding a break can often 
be obtained by modifying the looping guard itself; this is what Infinitel does. 
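
The following sketch contrasts the two repair styles on a hypothetical loop (it is not the fop code): one variant leaves the loop with a break at the end of the body, the other moves the same exit test into the looping guard. All names are assumptions, and step is assumed to be non-negative.

```java
class BreakVersusGuardSketch {

    /** Fix in the style of a human patch: leave the loop with a break. */
    static int growWithBreak(int size, int target, int step) {
        while (size < target) {
            size += step;
            if (step == 0) {
                break; // no progress is possible, so stop iterating
            }
        }
        return size;
    }

    /** Fix in the style of a guard patch: the same exit test moves into the condition. */
    static int growWithGuard(int size, int target, int step) {
        while (size < target && step != 0) {
            size += step;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(growWithBreak(0, 10, 3) + " " + growWithGuard(0, 10, 3));
        System.out.println(growWithBreak(0, 10, 0) + " " + growWithGuard(0, 10, 0));
    }
}
```

With a zero step the break variant runs the body once and the guard variant not at all, yet both leave the state unchanged, so the observable result is the same.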

The same happens when we compare the found patch and the human fix in 
bug pdfbox A. The human wraps the while with an if statement that acts as a 
precondition. Logically, Infinitel finds that the angelic record for the hanging 
tests in pdfbox A (Table 3) is 0, indicating that the loop body should not be 
executed at all. Later on in the repair process, Infinitel synthesizes an expression 
that indeed acts both as a correct precondition and a correct exit condition. 


Table 4: Performance of Infinitel on our dataset. Rows (time metrics, in seconds), 
given for the seeded bugs and for the real bugs: Instrumentation, Compilation, Test Suite, 
Hanging Tests, Angelic Record Mining, Value Collection, SMT Solving, Total Time and 
Total Time (readable). 


(a)  - while (condition) { 
     + while (newCondition) { 

(b)  + boolean flag = true; 
     - while (...) { 
     + while (flag && ...) { 
           ... 
     +     flag = ...; 
       } 

(c)    while (...) { 
           ... 
     +     break; 
       } 

(d)  + if (...) { 
           while (...) { 
               ... 
           } 
     + } 

Example 4.1. Human fixes of: (a) csv and pig; (b) tika; (c) fop; (d) pdfbox A 

For the two remaining ones, pdfbox B and uima, alternative strategies are used 
in the human fix. The human fix for pdfbox B modifies a method invoked in the 
looping guard. The developer patch of uima adds a statement to the loop body. 

To sum up, there are many alternative strategies to repair an infinite loop. 
Examples 4.1(a), 4.1(b), 4.1(c) and 4.1(d) 
summarize them. This evaluation shows that our repair strategy (modifying the 
looping guard) is as powerful as the others. Having a single repair strategy enables 
us to greatly reduce the search space of the patch. 


4.2.4 Synthesis 


How hard is the code synthesis for each bug? Does the synthesis based on SMT 
satisfiability scale? We now concentrate on the code synthesis part based on SMT. 
We look in particular at the number of SMT problems generated. Recall that there 
is one SMT problem generated per set of operands to be used for synthesis (row 
SMT formulations in Table 3). 

Let’s first look at the three bugs seeded in real code of project Collections A, 
project Collections B and Math. The number of SMT problems for Collections 
A, Collections B and Ex.2 is 1. It means the synthesized looping guard directly 
refers to a boolean variable in the scope (e.g. “while (notDone)”). Logically, the 
synthesized code has no operators, which can be seen in the row giving the number 
of SMT components: when the number of SMT components or the 
number of SMT component types is zero, the synthesized condition 
does not use any operator but only a boolean variable that is present in the scope. 
For Math, the synthesized patch is found for the second 
SMT problem, using one single operator (<). 

We now analyze the code synthesis method for real bugs. The number of SMT 
formulations ranges from 2 (for csv, pdfbox A and uima) to 4 (for fop). Those 
numbers directly reflect the complexity of the synthesized patch, where several 
operators are needed, as shown in Table 5. For instance, the number of SMT 
formulations of fop is 4 and the corresponding patch contains 5 boolean clauses 
(fourth row). In Table 5, note that the operators in blue are not handled by 
SMT, since they come from the original loop condition and are evaluated as is. In 
our approach, for each new SMT problem, new operators (components) are added 
for synthesis. Logically, the greater the number of SMT problems, the greater 
the number of SMT components and component types, as shown in rows “SMT 
components” and “SMT component types”. Since at least one component is required 
to find the patch, it means that there is not a single variable that can be used to 
describe the completion point alone (such as “set.isEmpty()”). 


csv 
  faulty     !tkn.isReady 
  manual     !tkn.isReady && tkn.type != TT_EOF 
  infinitel  (tkn.type)<(0) 

pig 
  faulty     (!((fileStatusArr = fs.listStatus(path)) == null || fs.isFile(path))) 
  manual     (!((fileStatusArr = fs.listStatus(path)) == null || fs.isFile(path) || 
             fileStatusArr.length == 0)) 
  infinitel  (!(((fileStatusArr = fs.listStatus(path)) == null) || (fs.isFile(path)))) 
             && ((0)<(fileStatusArr.length)) 

tika 
  faulty     getContentLength() < getBlockLength() 
  manual     adding variable continueLoop: 
             continueLoop && getContentLength() < getBlockLength() 
  infinitel  ((getContentLength()) < (getBlockLength())) 
             && ((!((this.chmSection.getData().length)==(this.state.getWindowSize()))) 
             || (this.state.getMainTreeTable()!=null)) 

fop 
  faulty     (scale < 1 && nextStepFontSize > baseFontSize || 
             scale > 1 && nextStepFontSize < baseFontSize) 
  manual     adding a break statement 
  infinitel  (((scale < 1) && (nextStepFontSize > baseFontSize)) || 
             ((scale > 1) && (nextStepFontSize < baseFontSize))) 
             && (((FontSizePropertyMaker.FONT_SIZE_GROWTH_FACTOR)+ 
             ((FontSizePropertyMaker.FONT_SIZE_GROWTH_FACTOR) - (nextStepFontSize)))<(-1)) 

pdfbox A 
  faulty     (amountRead = rawData.read(buffer, 0, Math.min(mayRead,BUFFER_SIZE))) != -1 
  manual     adding wrapping if 
  infinitel  ((amountRead = 
             rawData.read(buffer, 0, Math.min(mayRead,BUFFER_SIZE))) != -1) 
             && ((filterIndex)<(amountRead)) 

pdfbox B 
  faulty     (amountRead = 
             read(data, totalAmountRead, numberOfBytes-totalAmountRead)) != -1 
             && totalAmountRead < numberOfBytes 
  manual     modifying read() method 
  infinitel  (((amountRead = 
             read(data, totalAmountRead, (numberOfBytes - totalAmountRead))) != (-1)) 
             && (totalAmountRead < numberOfBytes)) 
             && ((amountRead)==((numberOfBytes - totalAmountRead))) 

uima 
  faulty     offset > 0 
  manual     modifying loop body 
  infinitel  (indent.length()) != (offset) 


Table 5: Patches synthesized by Infinitel for the 7 real bugs of our dataset. The 
code in blue (on screen or a color-printed version) is an expression that is reused 
from the original loop condition. 


4.2.5 Angelic Record 


Does the 1 million iteration limit used for finding the angelic record have type I 
errors (false positives)? Does it affect the performance of Infinitel? Recall that 
the angelic record is the minimum number of iterations required to break the 
infinite loop while the hanging test passes. An angelic record of 0 means that 
the loop must be skipped, i.e. that the loop precondition is not met. We use 
a threshold of one million, which means that if we detect a loop that has more 
than one million iterations in a single execution, we label it as infinite. A false 
positive would be a loop that indeed needs more than one million iterations, and 
thus would incorrectly be detected as infinite. Among the real bugs, there were no 
false positives. The maximum angelic record is 45 (for fop), which is far below 
the threshold. For seeded bugs, the maximum value is 52 (Math). Moreover, 
finding the angelic record is the least expensive step in terms of time, as shown in 
Table 4. 
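
The notion of angelic record can be illustrated with a small probing routine. This is a sketch under assumptions, not the probing algorithm of Infinitel: the buggy method padLeft, the linear probing strategy and the small probing bound are all hypothetical, and the hanging test is modeled as a predicate over the forced iteration limit.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

class AngelicRecordSketch {

    /** Threshold above which an execution is labeled as infinite. */
    static final int THRESHOLD = 1_000_000;

    /** Hypothetical buggy method: width is never decremented, so the guard never fails. */
    static String padLeft(int width, int maxIterations) {
        StringBuilder indent = new StringBuilder();
        int iterations = 0;
        while (width > 0) {
            if (iterations++ >= maxIterations) {
                break; // forced exit by the stand-in loop monitor
            }
            indent.append(' ');
        }
        return indent.toString();
    }

    /** Smallest forced iteration limit in [0, maxProbe] for which the test passes. */
    static OptionalInt angelicRecord(IntPredicate testPassesWithLimit, int maxProbe) {
        for (int limit = 0; limit <= maxProbe; limit++) {
            if (testPassesWithLimit.test(limit)) {
                return OptionalInt.of(limit);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Hypothetical hanging test: padding to width 4 must yield exactly 4 characters.
        IntPredicate test = limit -> padLeft(4, limit).length() == 4;
        System.out.println(angelicRecord(test, 100));
    }
}
```

In this toy case the hanging test passes only when the loop is interrupted after exactly four iterations, far below THRESHOLD.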


4.2.6 State Observation 


How does our value collection technique qualify the loop state in each bug? The 
goal of our runtime value collection technique (Section 3.5.3) is to collect the 
variables that correctly capture the state of the program. In particular, it must 
contain the variables that are required to synthesize a correct looping guard. 
Now, we look at the elements used in the synthesized looping guard to assess the 
importance of each phase of our runtime value collection technique. 

For seeded bugs, we use extraction by queries, extraction by value from reachable 
variables, subvalues of the original looping guard and getters. For real bugs, 
the patches are given in Table 5. We use visible field access in csv (“tkn.type”). 
We use recycling of the original looping guard in pig, tika, fop and pdfbox B 
(coloured in blue in the table, on screen or in a color-printed version). We use 
extraction by queries in pig (“fileStatusArr.length”). We use subvalues of 
the original looping guard in pdfbox B (“numberOfBytes-totalAmountRead” in 
the right hand side of the equality component). We use extraction by value of 
reachable variables (such as the static field FONT_SIZE_GROWTH_FACTOR in fop). 
We also use getters in tika (e.g., “this.state.getWindowSize()”). 

To sum up, all components of our runtime value collection technique are useful, 
since they are all present in the found patches. 


4.2.7 Idempotence 


Is there any idempotent loop among the real bugs? Earlier in this paper, we made the 
analytical argument that some loops may be idempotent; that is, it is possible to 
add any arbitrary number of iterations beyond a threshold without breaking the 
correctness of the computation. We are now interested in the idempotence of the 
real bugs. The experimental setup is as follows: we simply observe whether the 
hanging test passes after the maximum number of iterations (one million). 

Surprisingly, we observe that the infinite loops of csv, fop, pdfbox A, pdfbox 
B and pig are all idempotent. This aspect could be leveraged during the code 
synthesis phase: it may occur that the looping guard becomes “easier” to synthesize if 
more iterations are performed. By easier, we mean it could involve fewer variables 
or fewer SMT components. In this case, this would improve the SMT Solving time, 
which is particularly high for csv and fop. However, this optimization remains 
outside the scope of this paper and is left for future work. 
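
The observation used in this experiment can be sketched on a hypothetical loop body whose effect vanishes once a fixed point is reached. The method names and the loop itself are assumptions; the sketch only shows what it means for the result to be identical when the loop is interrupted at the angelic record or at the one million threshold.

```java
class IdempotenceSketch {

    /** Runs a loop body maxIterations times, modeling a forced exit at that limit. */
    static int advance(int end, int maxIterations) {
        int pos = 0;
        for (int i = 0; i < maxIterations; i++) {
            pos = Math.min(pos + 1, end); // no effect once pos == end
        }
        return pos;
    }

    /** True when stopping at the angelic record or at the threshold gives the same result. */
    static boolean looksIdempotent(int end, int angelicRecord, int threshold) {
        return advance(end, angelicRecord) == advance(end, threshold);
    }

    public static void main(String[] args) {
        System.out.println(looksIdempotent(5, 5, 1_000_000));
    }
}
```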


5 Discussion 


Our approach for automatically repairing infinite loops is built on three assump- 
tions. 

The first one is that in each test case there is at most one infinite execution. 
That is, if the infinite execution is interrupted, any subsequent invocation will 
be finite. We assume that the once-infinite loop is invoked zero or more times 
with finite invocations, but at most once with an infinite invocation. We use 
this assumption in Algorithm 3 because we only probe the angelic record for that 
single infinite execution. 

The second assumption is that the hanging tests have a deterministic execution. 
Let us assume that in a given test case, a loop is executed several times 
(for instance by calling n times the method containing it) before entering the 
infinite non-terminating mode. If during the execution of a hanging test the n-th 
invocation of an infinite loop is an infinite execution of the loop, then the n-th 
invocation of that loop is the infinite invocation on every execution of that test. 
We use this assumption in Algorithm 3 because we probe the angelic record in a 
specific execution number. 

Non-determinism in passing tests may also impact the synthesis of the looping 
guard. We only run each invoking test once to collect runtime values, and the 
synthesized looping guard is guaranteed to be correct only for the given input-output 
pair set. However, suppose a passing test produces two different sets of 
input-output pairs within the n-th loop execution of two different runs, namely (I_a, O) 
and (I_b, O). That is, the original looping guard evaluates to the same boolean 
value in the n-th iteration, but with two different states. Because we synthesize the 
looping guard using only one of these inputs, say I_a, there are no guarantees that 
the looping guard would also evaluate to O for input I_b. Hence, the synthesized 
looping guard would make the once-passing test fail intermittently. 

The third assumption is that the collected values are comprehensive enough to 
synthesize the looping guard. Although it was sufficient to repair the 14 infinite 
loops in our dataset, it may not always be the case. One way to mitigate this 
problem is to use other method calls when amassing variables for runtime value 
collection. Another way is to improve our runtime value collection technique. 
For instance, when we analyze a loop in a method of an anonymous class (this 
happens in Java), we do not include instance fields of the anonymous class. 


6 Related Work 


One seminal automatic repair technique is GenProg [T]. GenProg is generic by 
design, and may be applicable for repairing infinite loops. On the contrary, our 
repair method addresses a specific defect class (infinite loops). However, GenProg 
can only find a patch if the repair code already exists in the program, whereas we 
are able to genuinely synthesize a new expression. 

Dallmeier et al. [5] have presented Pachika, a fix generation approach via object 
behavior anomaly detection. This approach identifies the difference between 
program behaviors by the executions of passing and failing test cases; then fixes 
are generated by inserting or deleting method calls. Unlike Infinitel, Pachika does 
not fix loop conditions. 

Kim et al. [11] proposed Par, a repair approach using fix patterns representing 
common ways of fixing common bugs in Java. These fix patterns can avoid the 
nonsensical patches due to the randomness of some mutation operators. None of 
the patterns are specific to infinite loops, and the evaluation does not mention 
any such bug. 

Another recent approach is SemFix [13]. The SemFix methodology consists 
of locating a suspicious assignment or conditional, and then executing the test 
cases with symbolic execution on that statement. The constraints resulting from 
symbolic execution are then used to identify a state-change that enables the test 
to pass. Then, code synthesis is also used to synthesize a code change. An 
extension of SemFix for repairing infinite loops can be envisioned as follows: one 
can unfold the infinite loop before symbolic execution. This would lead to a path 
explosion, and it is yet unknown whether this would scale for real programs of 
size comparable with our dataset. 

NoPol is our previous work on automatic software repair [6]. NoPol also 
addresses a specific defect class: wrong conditionals or missing preconditions. Our 
technique Infinitel follows an approach that is similar to the one used in NoPol, 
with two differences. Firstly, NoPol uses spectrum-based fault localization, which 
is not applicable for infinite loops. We use a completely different technique based 
on instrumentation to detect the actual infinite loop. Secondly, whereas NoPol 
finds a boolean angelic value, we search for an integer angelic record (the loop 
threshold), representing the minimum number of iterations for a hanging test 
case to complete and pass. 

Regarding one of our aforementioned loop properties, the phenomenon of 
“idempotent loops” has also been observed by [14]: “a specific instance of the 
loop can iterate for fewer or greater number of iterations without affecting program 
output”. However, their goal is completely different from repair: they aim to 
characterize outcome-tolerant branch instances to discover new ways to enhance 
the processor’s performance. 

Burnim and colleagues [2] have presented an approach to detect infinite loops. 
They use symbolic execution for reasoning and implement it on top of Java. They 
only address detection and do not repair the infinite loop as we do in this paper. 
The same argument applies to the work of Ibing and Mai, which only focuses on 
detection, in a static manner [9]. 

Finally, Jolt [3] is an approach to repair infinite loops at runtime. Jolt attaches 
to an application to audit its progress. It records the program state at the start of 
each loop iteration. If two consecutive loop iterations produce the same state, Jolt 
reports that the application is in an infinite loop. The key difference is that Jolt 
works at runtime: it simply escapes the loop without changing the looping condition. 
On the contrary, our approach is off-line and based on a hanging test, and we are 
able to synthesize a new looping guard. 
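
The detection idea summarized above (two consecutive iterations starting from the same state) can be sketched as follows. This is not Jolt's implementation: the program state is reduced to a single int, and the class name, the method name and the functional interfaces used to model the guard and the body are all assumptions.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

class RepeatedStateDetector {

    /**
     * Simulates "while (guard(state)) state = body(state)" and returns the index of the
     * first iteration that leaves the state unchanged, or -1 if the loop terminates
     * (or maxIterations is reached) before that happens.
     */
    static int firstRepeatedState(int initialState, IntPredicate guard,
                                  IntUnaryOperator body, int maxIterations) {
        int state = initialState;
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            if (!guard.test(state)) {
                return -1; // the loop terminates normally
            }
            int next = body.applyAsInt(state);
            if (next == state) {
                return iteration; // same state twice in a row: the loop cannot progress
            }
            state = next;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical stuck loop: the state saturates at 9 but the guard waits for 10.
        System.out.println(firstRepeatedState(0, s -> s != 10, s -> Math.min(s + 3, 9), 1000));
        // Hypothetical terminating loop.
        System.out.println(firstRepeatedState(0, s -> s < 9, s -> s + 3, 1000));
    }
}
```

Such a check only tells that the loop is stuck; it does not say what the looping guard should have been, which is the part addressed by synthesis.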


7 Conclusion 


In this paper, we have proposed a novel method to automatically repair infinite 
loops. To this end, we have developed static and dynamic source code analysis 
techniques, along with a code synthesis technique based on SMT problems. Our 
method detects the location of the infinite loop, collects execution information 
about its expected behavior and eventually synthesizes a new and correct loop 
condition. 

For future work, we will explore whether our framework could be applied to fix 
different yet related defect classes: wrong loop conditions (which do not necessarily 
result in infinite loops, but in incorrect output) and infinite recursion. 